Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go: Performance: Don't compile regex on matcher create #107

Merged
merged 10 commits into from
Aug 10, 2023

Conversation

tigh-latte
Copy link
Member

@tigh-latte tigh-latte commented Apr 19, 2023

🤔 What's changed?

I was writing some benchmarks for cucumber and noticed that ParseFromBytes was allocating a lot of memory. A brief venture down the rabbit hole and I found one of the contributors was regexp.MustCompile being called every time a matcher was created.

Moving the language pattern to the global space improved performance by 85% on my machine:

> benchstat tighbench/final/matcher_old.txt tighbench/final/matcher_new.txt
goos: linux
goarch: amd64
cpu: AMD Ryzen 9 5950X 16-Core Processor
                                    │ tighbench/final/matcher_old.txt │   tighbench/final/matcher_new.txt   │
                                    │             sec/op              │   sec/op     vs base                │
_ParseGherkinDocumentForLanguage-32                      14.208µ ± 6%   2.188µ ± 5%  -84.60% (p=0.000 n=10)

                                    │ tighbench/final/matcher_old.txt │   tighbench/final/matcher_new.txt    │
                                    │              B/op               │     B/op      vs base                │
_ParseGherkinDocumentForLanguage-32                     15.414Ki ± 0%   5.398Ki ± 0%  -64.98% (p=0.000 n=10)

                                    │ tighbench/final/matcher_old.txt │  tighbench/final/matcher_new.txt   │
                                    │            allocs/op            │ allocs/op   vs base                │
_ParseGherkinDocumentForLanguage-32                       168.00 ± 0%   24.00 ± 0%  -85.71% (p=0.000 n=10)

Also, the inline regexp.MustCompile call for tag matching were also expensive to compile and execute. Moving one to top level and removing the other completely for a simple strings.ContainsAny saw performance of that function improve by over 80%:

> benchstat tighbench/final/tag_old.txt tighbench/final/tag_new_regex.txt
goos: linux
goarch: amd64
cpu: AMD Ryzen 9 5950X 16-Core Processor
                        │ tighbench/final/tag_old.txt │  tighbench/final/tag_new_regex.txt  │
                        │           sec/op            │   sec/op     vs base                │
Matcher_MatchTagLine-32                 12.686µ ± 12%   2.426µ ± 6%  -80.88% (p=0.000 n=10)

                        │ tighbench/final/tag_old.txt │  tighbench/final/tag_new_regex.txt   │
                        │            B/op             │     B/op      vs base                │
Matcher_MatchTagLine-32                  9.013Ki ± 0%   1.034Ki ± 0%  -88.53% (p=0.000 n=10)

                        │ tighbench/final/tag_old.txt │ tighbench/final/tag_new_regex.txt  │
                        │          allocs/op          │ allocs/op   vs base                │
Matcher_MatchTagLine-32                   131.00 ± 0%   20.00 ± 0%  -84.73% (p=0.000 n=10)

⚡️ What's your motivation?

May as well make code faster if we can.

🏷️ What kind of change is this?

  • 🏦 Refactoring/debt/DX (improvement to code design, tooling, documentation etc. without changing behaviour)

♻️ Anything particular you want feedback on?

I have had a very hard time testing these changes due to a lack of unit testing in this library, and I can't get the godog tests running on my machine anymore for whatever reason. If someone could help me to make sure I've broke nothing I'd appreciate it.

📋 Checklist:

  • I agree to respect and uphold the Cucumber Community Code of Conduct
  • I've changed the behaviour of the code
    • I have added/updated tests to cover my changes.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • Users should know about my change
    • I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.

This text was originally generated from a template, then edited by hand. You can modify the template here.

@mpkorstanje
Copy link
Contributor

I've kicked of CI for you. But you may want to check the #commiters and/or #commiters-go channels on Slack for someone with a bit more Go knowledge than me.

@tigh-latte
Copy link
Member Author

@mpkorstanje looks like the secret was make acceptance. That found a problem that I sorted there.

Copy link
Contributor

@mpkorstanje mpkorstanje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Please do update the CHANGELOG.md with a ### Fixed entry.

Also one question about dependencies jumping around.

go/go.mod Outdated Show resolved Hide resolved
go/go.mod Outdated Show resolved Hide resolved
Copy link
Contributor

@dolmen dolmen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove changes in go.mod/go.sum (because the fix is in #133) and this will be good.

@mpkorstanje mpkorstanje self-assigned this Aug 8, 2023
@mpkorstanje mpkorstanje merged commit b107924 into cucumber:main Aug 10, 2023
@aslakhellesoy
Copy link
Contributor

Hi @tigh-latte,

Thanks for your making your first contribution to Cucumber, and welcome to the Cucumber committers team! You can now push directly to this repo and all other repos under the cucumber organization! 🍾

In return for this generous offer we hope you will:

  • ✅ Continue to use branches and pull requests. When someone on the core team approves a pull request (yours or someone else's), you're welcome to merge it yourself.
  • 💚 Commit to setting a good example by following and upholding our code of conduct in your interactions with other collaborators and users.
  • 💬 Join the community Slack channel to meet the rest of the team and make yourself at home.
  • ℹ️ Don't feel obliged to help, just do what you can if you have the time and the energy.
  • 🙋 Ask if you need anything. We're looking for feedback about how to make the project more welcoming, so please tell us!

On behalf of the Cucumber core team,
Aslak Hellesøy
Creator of Cucumber

@mpkorstanje
Copy link
Contributor

@tigh-latte that took a while! But cheers!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants